method utilizes fusion protein constructs containing a DNA binding domain and complementary dimerization domains from a 
different protein. According to the method of the invention, protein partner heterodimer formation is detected by the ability of the 
Protein Partner Screening Assays and Uses Thereof 



Field of the Invention 

This invention is in the field of molecular biology and is directed to a 
method of identifying a peptide capable of associating with another peptide in 
a heterodimeric complex. The invention is also directed to a method of 
identifying inhibitors of such heterodimeric complex formation. 

Background of the Invention 

Many regulatory proteins are heterodimers, that is, they are composed 
of two different peptide chains which interact to generate the native protein. 
Among such regulatory proteins are DNA binding proteins which are 
capable of binding to specific DNA sequences and thereby regulating 
transcription of DNA into RNA. The dimerization of such proteins is 
necessary in order for these proteins to exhibit such binding specificity. A 
large number of transcriptional regulatory proteins have been identified: Myc, 
Fos, Jun, Ebp, Fra-1, Jun-B, Sp1, H2TF-1/NF-κB-like protein, PRDI, TDF, 
GLI, Evi-1, the glucocorticoid receptor, the estrogen receptor, the 
progesterone receptor, the thyroid hormone receptor (c-erbA) and ZIF/268, 
OTF-1 (OCT1), OTF-2 (OCT2) and PIT-1; the yeast proteins GCN4, GAL4, 
HAP1, ADR1, SWI5, ARGRII and LAC9, mating type factors MATa1, 
MATα2 and MATα1; the Neurospora proteins cys-3 and possibly cpc-1; and 
the Drosophila proteins bsg 25D, kruppel, snail, hunchback, serendipity, 
suppressor of hairy wing, antennapedia, ultrabithorax, paired, fushi tarazu, 
cut, and engrailed. Eukaryotic transcriptional regulatory proteins, and the 
methods used to characterize such proteins, have been recently reviewed 
(Pabo, C.O. et al., Ann. Rev. Biochem. 61:1053-1095 (1992); Johnson, P.F. 
et al., Ann. Rev. Biochem. 58:799-839 (1989)). 

Members of the mammalian transcriptional regulatory protein families 
Jun/Fos and ATF/CREB only bind to DNA as dimers. The proteins in these 
families are "leucine zipper" proteins which contain a region rich in basic 
amino acids followed by a stretch of about 35 amino acids which contains 4-5 
leucine residues separated from each other by 6 amino acids (the "leucine 
zipper" region). Collectively, the combination of a basic region and the 
leucine zipper region is termed the bZIP domain. 
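
The spacing rule just described (leucines separated by 6 residues, i.e. one leucine at every seventh position) can be illustrated with a short scan. This is only a sketch under stated assumptions: the function name and the default of 4 repeats are illustrative choices, not part of the specification, and a real zipper prediction would also weigh helix propensity and charge distribution.

```python
def find_leucine_heptads(seq, min_repeats=4):
    """Return (start_index, repeat_count) for runs of leucine at every 7th residue.

    Hypothetical helper for illustration: it checks only the heptad
    leucine spacing, nothing else about the sequence.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq)):
        if seq[start] != "L":
            continue
        # Skip leucines that merely continue a run begun 7 residues earlier.
        if start >= 7 and seq[start - 7] == "L":
            continue
        count = 0
        pos = start
        while pos < len(seq) and seq[pos] == "L":
            count += 1
            pos += 7
        if count >= min_repeats:
            hits.append((start, count))
    return hits
```

For a toy sequence such as "KR" followed by four copies of "LAAAAAA", the function reports one run of four leucines starting at index 2; three copies fall below the default threshold and give no hit.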

Generally, it is the basic region which has been found to be 
predominantly involved in contacting DNA whereas the zipper region mediates 
the dimerization. Many dimeric combinations are possible; however, the 
particular nature of the zipper specifies which partnerships are permissible 
(Abel, T. et al., Nature 341:24-25 (1989)). 

Another large family of proteins contains the DNA 
binding/dimerization motif known as the basic helix-loop-helix motif (bHLH) 
(Jones, N., Cell 61:9-11 (1990)). A bHLH protein generally contains a basic 
N-terminus followed by a helix-loop-helix structure: two short amphipathic 
helices containing hydrophobic residues at every third or fourth position. The 
sequence of the basic region characteristically reveals no indication of an 
amphipathic helix. The intervening loop region usually contains one or more 
helix-breaking residues. 

The bHLH motif was first detected in two proteins, E12 and E47, that 
bind to a specific "E box" DNA enhancer sequence found in immunoglobulin 
enhancers (Murre, C. et al., Cell 56:777-783 (1989)). E motifs generally are 
double stranded variants of the 5'-CAGGTGGC-3' consensus sequence. For 
example, the μE1 motif is GTCAAGATGGC [SEQ ID NO. 1], the μE2 motif is 
AGCAGCTGGC [SEQ ID NO. 2], μE3 is GTCATGTGGC [SEQ ID NO. 3], and 
liE is TGCAGGTGT (Murre, C. et al., Cell 56:777-783 (1989)). Like many 
transcriptional factors, peptides containing the bHLH motif often dimerize with 
each other, either as a homodimer which contains two identical peptides or as 
a heterodimer which contains two different peptides. Examples are known of 
heterodimeric complexes of two bHLH proteins that bind DNA with greater 
efficiency than the homodimeric complexes of either peptide in the heterodimer 
(Murre, C. et al., Cell 56:777-783 (1989); Murre, C. et al., Cell 
58:537-544 (1989)). 
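
Because E motifs are double stranded, a search for them in a DNA sequence has to consider both strands. The following minimal sketch looks for exact copies of the consensus and of the three motif sequences with sequence identifiers quoted above, on either strand. The dictionary keys ("muE1" and so on) and the function names are labels chosen for this illustration only, and exact string matching is an assumption; it does not model binding affinity or tolerated mismatches.

```python
# Motif sequences as quoted in the text; keys are illustrative labels.
E_MOTIFS = {
    "consensus": "CAGGTGGC",
    "muE1": "GTCAAGATGGC",
    "muE2": "AGCAGCTGGC",
    "muE3": "GTCATGTGGC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(_COMPLEMENT)[::-1]


def find_e_motifs(dna):
    """Return sorted (position, name, strand) hits for exact motif matches.

    Strand "-" means the motif lies on the complementary strand; the
    position is always given on the supplied (top) strand.
    """
    dna = dna.upper()
    hits = []
    for name, motif in E_MOTIFS.items():
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            pos = dna.find(pattern)
            while pos != -1:
                hits.append((pos, name, strand))
                pos = dna.find(pattern, pos + 1)
    return sorted(hits)
```

For instance, scanning "TTAGCAGCTGGCAA" finds the μE2 sequence on the top strand at position 2, while "AAGCCACCTGAA" contains the consensus only on the complementary strand.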

Identification of partners which direct protein-DNA binding and 
compounds which inhibit such activity by inhibiting such protein partner 
interaction could be very useful. For example, identification of partners of the 
myc protein and inhibitors of myc-partner interactions could provide a means 
for treating diseases in which expression and activity of myc is a factor in 
promoting cell growth or in maintaining the cell in a transformed state. 

Myc is a bHLH protein and the bHLH domain of c-myc is encoded in 
c-myc amino acids 354-411. The sequence homology between the proteins 
expressed by the three myc genes (human N-myc 393-437, human c-myc 354- 
411, and human L-myc 289-338) and other genes which contain a bHLH 
domain has been compared (Murre, C. et al., Cell 56:777-783 (1989)). 

Proteins such as myc which contain the bHLH motif also possess the 
ability to dimerize with other bHLH motif proteins. Such interactions among 
bHLH proteins may play a critical role in their function and/or regulation. 
Identification of these protein partners would be useful not only in 
understanding how these proteins function, but also in developing or 
identifying inhibitors of these proteins. For example, identification of myc- 
partners would make it possible to identify inhibitors of myc-partner 
interactions. By inhibiting such interactions, inhibition and/or control of myc- 
induced cell growth may be achieved. 

To date, no myc inhibitors have been identified. The identification of 
such inhibitors has suffered for lack of a simple, inexpensive and reliable 
screening assay which could rapidly identify potential inhibitors and active 
derivatives thereof. Thus a need still exists for rapid, economical screening 
assays which identify specific inhibitors of oncogene activity. 

Summary of the Invention 

Recognizing the potential importance of inhibitors of oncoproteins in 
the therapeutic treatment of many forms of cancer, and cognizant of the lack 
of a simple assay system in which such inhibitors might be identified, the 
inventors have investigated the use of chimeric oncogene constructs in in vitro 
assays in prokaryotic hosts as a model system for identifying agents which 
alter oncogene expression. 

These efforts have culminated in the development of a simple, 
inexpensive assay which can be used to identify protein partners in general, 
and partners of transcriptional regulatory proteins in particular. 

The methods of the invention are especially useful for the identification 
of partners which influence transcriptional regulatory proteins, and especially 
oncoprotein activity. 

The method of the invention further provides a method of identifying, 
isolating and characterizing inhibitors of such partner formation and especially 
inhibitors of oncoprotein activity. 

The invention further provides a quick, reliable and accurate method 
for objectively classifying compounds, including human pharmaceuticals, as 
inhibitors of oncogene activity. 

The invention further provides a method of identifying protein partners 
by their ability to disrupt λ cI-induced repression of phage promoters in 
bacterial hosts which express fusion proteins containing the cI DNA binding 
domain and a dimerization domain from a protein of interest. Proteins 
identified by this method are partners of the protein from which the 
dimerization domain was obtained. Protein partners thus identified are already 
in a cloned form, amenable to further characterization. 

Brief Description of the Drawings 

Figure 1 shows the DNA sequence (SEQ ID NO. 4) and protein 
sequence (SEQ ID NO. 5) of human c-myc exon 3 and the sites used to 
synthesize the HLH/LZ and HLH fragments of c-myc. 

Description of the Preferred Embodiments 

In the description that follows, a number of terms used in recombinant 
DNA technology are extensively utilized. In order to provide a clearer and 
more consistent understanding of the specification and claims, including the 
scope to be given such terms, the following definitions are provided in 
alphabetical order. 

Bioactive Compound. The term "bioactive compound" is intended to 
refer to any compound which induces a measurable response in the assays of 
the invention. 

Cloning vehicle. A "cloning vehicle" is any molecular entity which is 
capable of providing a nucleic acid sequence to a host cell for cloning 
purposes. Examples of cloning vehicles include plasmids or phage genomes. 
A plasmid which can replicate autonomously in the host cell is especially 
desired. Alternatively, a nucleic acid molecule which can insert into the host 
cell's chromosomal DNA is especially useful. 

Cloning vehicles are often characterized by one or a small number of 
endonuclease recognition sites at which such DNA sequences may be cut in 
a determinable fashion without loss of an essential biological function of the 
vehicle, and into which DNA may be spliced in order to bring about its 
replication and cloning. 

The cloning vehicle may further contain a marker suitable for use in 
the identification of cells transformed with the cloning vehicle. Examples of 
such markers are tetracycline resistance and ampicillin resistance. The word 
"vector" is sometimes used for "cloning vehicle." 

Compound. The term "compound" is intended to refer to a chemical 
entity, whether in the solid, liquid, or gaseous phase. The term should be read 
to include synthetic compounds, natural products and macromolecular entities 
such as polypeptides, polynucleotides, or lipids, and also small entities such 
as neurotransmitters, ligands, hormones or elemental compounds. 

Dimeric Protein. The term "dimeric protein" is intended to refer to a 
protein which contains two polypeptide chains that associate with one another, 
but which are not bound to one another by an amino acid linkage. Association 
of the polypeptide chains may be due to, for example, hydrogen bonding, 
ionic interactions, hydrophobic interactions, disulfide bonds, and the like. 

Dimerization Domain. The term "dimerization domain" is intended to 
refer to that portion of each polypeptide chain of a dimeric protein which is 
necessary for the polypeptide chains to associate with one another. The 
dimerization domains of a dimeric protein, which may be identical or 
different, are referred to herein as complementary to each other. 

Expression. Expression is the process by which the information 
encoded within a gene is transcribed and translated into protein. 

A nucleic acid molecule, such as a DNA or gene, is said to be "capable 
of expressing" a polypeptide if the molecule contains the sequences which 
code for the polypeptide and the expression control sequences which, in the 
appropriate host environment, provide the ability to transcribe, process and 
translate the genetic information contained in the DNA into a protein product, 
and if such expression control sequences are operably-linked to the nucleotide 
sequence which encodes the polypeptide. 

Expression vehicle. An "expression vehicle" is a vehicle or vector 
similar to a cloning vehicle but is especially designed to provide sequences 
capable of expressing the cloned gene after transformation into a host. 

In an expression vehicle, the gene to be cloned is operably-linked to 
certain control sequences such as promoter sequences. 

Expression control sequences will vary depending on whether the 
vector is designed to express the operably-linked gene in a prokaryotic or 
eukaryotic host and may additionally contain transcriptional host specific 
elements such as operator elements, upstream activator regions, enhancer 
elements, termination sequences, tissue-specificity elements, and/or 
translational initiation and termination sites. 

Functional Derivative. A "functional derivative" of a fusion protein is 
a protein which possesses an ability to dimerize with a partner protein, and/or 
an ability to bind to a desired DNA target, that is substantially similar to the 
corresponding ability of the fusion protein constructs of the invention. By 
"substantially similar" is meant that the above-described biological activities 
are qualitatively similar to those of the fusion proteins of the invention but 
quantitatively different. For example, a functional derivative of a fusion 
protein might recognize the same target as the fusion protein, or form 
heterodimers with the same partner protein, but not with the same affinity. 

As used herein, for example, a peptide is said to be a "functional 
derivative" when it contains the amino acid sequence of the fusion protein plus 
additional chemical moieties not usually a part of a fusion protein. Such 
moieties may improve the derivative's solubility, absorption, biological 
half-life, etc. The moieties may alternatively decrease the toxicity of the 
derivative, or eliminate or attenuate any undesirable side effect of the 
derivative, etc. Moieties capable of mediating such effects are disclosed in 
Remington's Pharmaceutical Sciences (1980). Procedures for coupling such 
moieties to a molecule are well known in the art. 

A functional derivative of a fusion protein may or may not contain 
post-translational modifications such as covalently linked carbohydrate, 
depending on the necessity of such modifications for the performance of the 
methods of the invention. 

The term "functional derivative" is intended to encompass functional 
"fragments," "variants," "analogues," or "chemical derivatives" of a molecule. 

Fusion protein. As used herein, "fusion protein" is a hybrid protein 
which has been constructed to contain domains from two different proteins. 
The term "fusion protein gene" is meant to refer to a DNA sequence 
which codes for a fusion protein, including, where appropriate, the 
transcriptional and translational regulatory elements thereof. 

Heterodimer. The term "heterodimer" or "heterodimeric protein" is 
intended to refer to a protein which contains two different polypeptide chains 
that associate with one another, but which are not bound to one another by an 
amino acid linkage. 

Homodimer. The term "homodimer" or "homodimeric protein" is 
intended to refer to a protein which contains two identical polypeptide chains 
that associate with one another, but which are not bound to one another by an 
amino acid linkage. This term may be modified to refer only to a particular 
portion of a dimeric protein. For instance, a DNA binding domain 
homodimer is intended to refer to any dimeric protein containing identical 
DNA binding domains on its separate polypeptide chains. 

Host. By "host" is meant any organism that is the recipient of a 
cloning or expression vehicle as defined herein. Appropriate hosts for use in 
the method of the invention include, but are not limited to, bacteria, yeast, and 
mammalian cells. 

Marker Gene. The term "marker gene" is intended to refer to a gene 
whose expression in a host cell produces a readily observable, assayable, or 
selectable phenotype. Examples of marker genes which may be useful in the 
method of the invention include, but are not limited to, lacZ, aadA (which 
confers spectinomycin and streptomycin resistance), and ble-1 (which confers 
bleomycin and phleomycin resistance). 

Operably-linked. As used herein, two macromolecular elements are 
operably-linked when the two macromolecular elements are physically 
arranged such that factors which influence the activity of the first element 
cause the first element to induce an effect on the second element. 

Promoter. A "promoter" is a DNA sequence located proximal to 
the start of transcription at the 5' end of the transcribed sequence, at which 
RNA polymerase binds or initiates transcription. The promoter may contain 
multiple regulatory elements that interact in modulating transcription of the 
operably-linked gene. 

Protein Partner. The term "protein partner" is intended to refer to a 
polypeptide chain capable of associating with a heterologous polypeptide chain 
to form a heterodimeric protein. The two polypeptide chains of a 
heterodimeric protein are herein referred to as "partners" of one another. A 
polypeptide chain of a homodimeric protein may act as a partner in a 
heterodimeric protein. 

Response. The term "response" is intended to refer to a change in any 
parameter which can be used to measure and describe the effect of a 
compound on the activity of a protein. The response may be revealed as a 
physical change (such as a change in phenotype) or a molecular change (such 
as a change in a reaction rate or affinity constant). Detection of the response 
may be performed by any means appropriate. 

Variant. A "variant" of a fusion protein is a protein which 
contains an amino acid sequence that is substantially similar to, but not 
identical to, the amino acid sequence of a fusion protein constructed from 
naturally-occurring domains, that is, domains containing the native 
amino acid sequence. 

By a "substantially similar" amino acid sequence is meant an amino 
acid sequence that is highly homologous to, but not identical to, the amino 
acid sequence found in a fusion protein. Highly homologous amino acid 
sequences include sequences of 80% or more homology, and possibly lower 
homology, especially if the homology is concentrated in domains of interest. 
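
As a rough illustration of the 80% figure, percent identity over two pre-aligned sequences can be computed as below. This is a sketch under stated assumptions: the text speaks of "homology" without fixing a scoring method, so simple identity over aligned, ungapped positions is used here as a stand-in, and the function names and the gap character "-" are illustrative choices.

```python
def percent_identity(a, b):
    """Percent of identical residues over aligned positions without gaps.

    Both inputs must already be aligned to equal length; "-" marks a gap.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # gap columns are not scored
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def is_highly_homologous(a, b, threshold=80.0):
    """Apply the 80% cutoff from the text (identity used as a proxy)."""
    return percent_identity(a, b) >= threshold
```

Restricting the comparison to a domain of interest, as the text suggests, amounts to passing only the aligned slice covering that domain.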

Transcription regulatory proteins, which normally function as dimeric 
proteins, have been found to possess discrete dimerization domains and DNA 
binding domains. The inventors have used these findings to develop the 
method of the invention for identifying a partner of a dimeric protein. This 
method involves construction of chimeric peptides with (1) known 
complementary dimerization domains and (2) DNA binding domains which, 
when present in homodimer form, are capable of conferring a detectable 
phenotype upon a host cell (preferably a bacterial host cell, such as E. coli). 

This detectable phenotype is a marker other than resistance to phage 
infection, such as infection by lambda phage. It has been discovered that this 
phenotype may be detected by methods which do not depend upon phage 
resistance. 

In the host cell, the chimeric peptides form DNA binding domain 
homodimers by association of the known complementary dimerization 
domains. Protein partners capable of associating with the chimeric peptides 
to form heterodimeric proteins will interfere with formation of the chimeric 
peptides into DNA binding domain homodimers. By monitoring the 
homodimer-conferred phenotype in the host cell, formation of interfering 
heterodimers may be detected and protein partners thus identified. 

This method of the invention is generally useful to identify partners for 
any homodimer or heterodimer. For a homodimer, a single chimeric peptide 
containing the dimerization domain of the homodimer is used. For a 
heterodimer, two separate chimeric peptides are used, each containing one of 
the complementary dimerization domains of the heterodimer. The chimeric 
peptides also contain a DNA binding domain that confers a detectable 
phenotype in homodimer form. 

DNA binding domains useful in construction of chimeric peptides of 
the invention may be obtained from proteins where they have been identified. 
For example, DNA binding domains may be obtained from bacteriophage 
repressors, such as bacteriophage lambda (λ) repressor. In particular, the 
lambda repressor protein cI is useful as a source of a DNA binding domain. 
cI represses lambda gene expression in its homodimeric form (Lambda II, 
Hendrix, R.W. et al., eds., Cold Spring Harbor Laboratory, New York 
(1983)). 

Other DNA binding domains may be identified by a variety of 
techniques known in the art and previously used to identify such domains (see 
Pabo, C.O. et al., Ann. Rev. Biochem. 61:1053-1095 (1992); Johnson, P.F. 
et al., Ann. Rev. Biochem. 58:799-839 (1989) for a review of such domains). 

DNA binding proteins, and DNA binding domains in such proteins, are 
identified and purified by their affinity for DNA. For example, DNA binding 
may be revealed in filter hybridization experiments in which the protein 
(usually labelled to facilitate detection) is allowed to bind to DNA immobilized 
on a filter or, vice versa, in which the DNA binding site (usually labelled) is 
bound to a filter upon which the protein has been immobilized. The sequence 
specificity and affinity of such binding is revealed with DNA protection assays 
and gel retardation assays. Purification of such proteins may be performed 
utilizing sequence-specific DNA affinity chromatography techniques, that is, 
column chromatography with a resin derivatized with the DNA to which the 
domain binds. Proteolytic degradation of DNA binding proteins may be used 
to reveal the domain which retains the DNA binding ability. 

Dimeric proteins for which protein partners are desired to be identified 
serve as the source of dimerization domains useful in the construction of 
chimeric peptides of the invention. Dimerization domains may be currently 
known dimerization domains or those recognized by their homology to known 
dimerization domains. Other dimerization domains may be predicted by 
analysis of the three-dimensional structure of a protein using the amino acid 
sequence and computer analysis techniques commonly known in the art, for 
example, the Chou-Fasman algorithm. Such techniques allow for the 
identification of helical domains and other areas of interest, for example, 
hydrophobic or hydrophilic domains, in the peptide structure. 

One class of known dimerization domains are the HLH domains, which 
share a common helix-loop-helix amino acid structure. The bHLH region of 
the c-myc protein is one such dimerization domain. This domain is 
complementary to itself and is therefore useful in the construction of chimeric 
peptides that form homodimers. 

An HLH dimerization domain in a protein can be identified by 
comparison of an amino acid sequence with that of ten known HLH 
dimerization domains (amino acids 336-393 in E12, 336-393 in E47, 554-613 
in daughterless, 357-407 in twist, 393-437 in human N-myc, 289-338 in 
human L-myc, 354-411 in human c-myc, 108-164 in MyoD, and, from genes of 
the achaete-scute locus, 101-167 of T4 and 26-95 of T5) (Murre, C. et al., Cell 
56:777-783 (1989)). The HLH dimerization domain contains two amphipathic 
helices separated by an intervening loop. The first helix contains 12 amino 
acids and the second helix contains 13 amino acids. Certain amino acids 
appear to be conserved in the HLH format, especially the hydrophobic 
residues which are present in the helices. Comparison of the sequences 
named above shows that there are five virtually identical hydrophilic residues 
within the 5' end of the homologous region and a set of mainly hydrophobic 
residues located in two short segments that are separated from one another by 
a sequence that generally contains prolines or clustered glycines. 

Another class of known dimerization domains are the leucine zipper 
domains. This domain is typically about 35 amino acids long and contains a 
repeating heptad array of leucine residues and an exceedingly high density of 
oppositely charged amino acids (acidics and basics) juxtaposed in a manner 
suitable for intrahelical ion pairing. It is thought that the leucines extending 
from the helix of one polypeptide interdigitate with those of the analogous 
helix of a second peptide (the partner) and form the interlock termed the 
leucine zipper. 

The DNA binding domain and the dimerization domain are engineered 
into the fusion protein in a manner which does not destroy the function of 
either domain; that is, the DNA binding domain, when properly dimerized, 
can recognize the DNA element to which it naturally binds and the 
dimerization domain retains the ability to dimerize with its partners. One of 
skill in the art, by running control assays, will be able to establish that the 
fusion protein functions in the proper manner. 

The DNA sequence encoding the fusion protein may be chemically 
constructed or constructed by recombinant means known in the art. Methods 
of chemically synthesizing DNA are well known in the art (Oligonucleotide 
Synthesis, A Practical Approach, M.J. Gait, ed., IRL Press, Washington, 
D.C., 1984; Synthesis and Applications of DNA and RNA, S.A. Narang, ed., 
Academic Press, San Diego, CA, 1987). Because the genetic code is 
degenerate, more than one codon may be used to construct the DNA sequence 
encoding a particular amino acid (Watson, J.D., In: Molecular Biology of the 
Gene, 3rd edition, W.A. Benjamin, Inc., Menlo Park, CA, 1977, pp. 356- 
357). 
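
The degeneracy of the genetic code can be made concrete by counting how many distinct DNA sequences encode a given peptide. The sketch below builds the standard genetic code table and multiplies the number of synonymous codons at each position. The function names are illustrative, and the standard code is an assumption; a given host may use codons with different frequencies, which this sketch ignores.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in TCAG order ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def codons_for(amino_acid):
    """Return the sorted list of codons encoding a one-letter amino acid."""
    codons = sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)
    if not codons:
        raise ValueError(f"unknown amino acid: {amino_acid!r}")
    return codons


def count_encodings(peptide):
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(len(codons_for(aa)) for aa in peptide.upper())
```

Methionine and tryptophan each have a single codon, whereas leucine has six, so even a short peptide rich in leucine can be encoded by many different DNA sequences.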

To express the recombinant fusion constructs of the invention, 
transcriptional and translational signals recognizable by the host are necessary. 
A cloned fusion protein gene, obtained through the methods described above, 
and preferably in a double-stranded form, may be operably-linked to sequences 
controlling transcriptional expression in an expression vector, and introduced, 
for example by transformation, into a host cell to produce the recombinant 
fusion proteins, or functional derivatives thereof, for use in the methods of the 
invention. 

Transcriptional initiation regulatory signals can be selected which allow 
for repression or activation of the expression of the gene encoding the fusion 
protein, so that expression of the fusion construct can be modulated, if 
desired. Of interest are regulatory signals which are temperature-sensitive so 
that by varying the temperature, expression can be repressed or initiated, or 
are subject to chemical regulation, for example, by a metabolite or a substrate 
added to the growth medium. Alternatively, the fusion construct may be 
constitutively expressed in the host cell. 

It is necessary to express the proteins in a host wherein the ability of 
the protein to retain its biological function is not hindered. Expression of 
proteins in bacterial hosts is preferably achieved using prokaryotic regulatory 
signals. 

Expression vectors typically contain discrete DNA elements such as, 
for example, (a) an origin of replication which allows for autonomous 
replication of the vector, or elements which promote insertion of the vector 
into the host's chromosome in a stable manner, and (b) specific genes which 
are capable of providing phenotypic selection in transformed cells. Many 
appropriate expression vector systems are commercially available which are 
useful in the methods of the invention. 

Once the vector or DNA sequence containing the construct(s) is 
prepared for expression, the DNA construct(s) is introduced into an 
appropriate host cell by any of a variety of suitable means, for example by 
transformation. After the introduction of the vector, recipient cells are grown 
in a selective medium, which selects for the growth of vector-containing cells. 
Expression of the cloned gene sequence(s) results in the production of the 
fusion protein. 

If the fusion protein DNA encoding sequence and an operably-linked 
promoter are introduced into a recipient host cell as a non-replicating DNA (or 
RNA) molecule, which may either be a linear molecule or, more preferably, 
a covalently closed circular molecule which is incapable of autonomous 
replication, the expression of the fusion protein may occur through the transient 
expression of the introduced sequence. 

Genetically stable transformants may be constructed with vector 
systems, or transformation systems, whereby the fusion protein DNA is 
integrated into the host chromosome. Such integration may occur de novo 
within the cell or be assisted by transformation with a vector which 
functionally inserts itself into the host chromosome, for example, with 
bacteriophage, transposons or other DNA elements which promote integration 
of DNA sequences in chromosomes. 

Cells which have been transformed with the fusion protein DNA 
vectors of the invention are selected by also introducing one or more markers 
which allow for selection of host cells which contain the vector. Markers 
incorporated in the vector may provide, for example, biocide resistance, e.g., 
resistance to antibiotics, or the like. 

WO 94/09133 
PCT/US93/09634 

The transformed host cell can be fermented according to means known 
in the art to achieve optimal cell growth, and also to achieve optimal 
expression of the cloned fusion protein sequence fragments. Optimal 
expression of the fusion protein is expression which provides no more moles 
of fusion protein subunit than the moles of the partner protein which 
are being expressed. However, variations in this amount are acceptable if they 
do not prevent the partner from forming heterodimers with the fusion protein, 
thereby interfering with fusion protein homodimer activity. 

Any protein that possesses a binding domain which can form a 
heterodimer with the fusion protein will impair or prevent the formation of 
fusion protein homodimers. Such proteins can thus be identified by their 
ability to interfere with the phenotype conferred by the fusion protein 
homodimer. 

In one embodiment the bacterial host, which is expressing a fusion 
protein as described above, is transformed with a λ expression library capable 
of expressing cloned eukaryotic genes. Those cells transformed with a 
eukaryotic gene expressing a protein which is a partner of the fusion protein 
can then be detected due to loss of the phenotype conferred by the fusion 
protein homodimer. 

λgt11 packaging systems for the creation of expression libraries from 
mRNA, which are useful in the methods of the invention, are known in the art 
and may be obtained commercially (for example, through Promega 
Corporation, Madison, Wisconsin). Further, custom genomic expression 
libraries may also be obtained commercially. Using the commercial kits, an 
oligo(dT)-primed cDNA library in λgt11 may be generated with the use of 
cytoplasmic poly(A)-containing mRNA from any desired mammalian source. 
To induce expression of the cloned proteins contained therein, 10 mM IPTG 
(isopropyl-thiogalactoside) may be added. 

A particular advantage of the method of the invention for the 
identification of protein partners is that, where approximately equal amounts 
of the fusion protein(s) and the protein partner are present in the host cell, the 
partner which is identified will have a higher affinity for the fusion protein(s) 
than the fusion protein(s) has for itself. If the disrupted dimerization is normally 
associated with a biological activity, such a protein partner is highly likely to 
be an important regulator of that biological activity. Further, the partner 
which is identified is already in a cloned, expressing form which may be 
utilized to obtain larger quantities of the protein for its isolation and further 
characterization by protein and molecular biology techniques known in the art. 

Utilizing the above techniques, a chimeric peptide containing the bHLH
dimerization region of c-myc and the DNA binding domain of cI was
constructed (see Example 1). In the appropriate host cell, this chimeric
peptide formed homodimers and repressed expression of the lacZ gene under
the control of a lambda PL promoter and repressed phage lysis (see Example
2). Introduction of a partner protein into the host cell interfered with
homodimer formation and de-repressed expression of the lacZ gene (see
Example 3). The inventors used this method to screen a cDNA expression
library and discovered a specific partner protein which associates with c-myc
in vivo (see Example 3).

Compounds which inhibit the ability of protein partners to form
interfering heterodimers, but which do not interfere with homodimer
formation, may be identified by screening for the ability of a compound to
reverse the interfering effect of the heterodimers and restore the
homodimer-conferred phenotype.

For example, for partners identified by de-repression of the lacZ gene
as described above (see also Example X), compounds which prevent or
otherwise interfere with heterodimer formation of the protein partners can be
identified by screening for the ability of such compounds to restore repression
of the lacZ gene and cause partner-containing cells to remain white when
grown on X-gal plates. A compound which is found to restore lacZ gene
repression in this example would be a compound which (a) prevents the fusion
protein from associating with the partner peptide which is also being expressed
in the host, (b) does not prevent homodimer formation and (c) does not inhibit
cell growth.

The methods of the invention can be used to screen compounds in their 
pure form, at a variety of concentrations, and also in their impure form. The 
methods of the invention can also be used to identify the presence of such 
inhibitors in crude extracts, and to follow the purification of the inhibitors 
therefrom. The methods of the invention are also useful in the evaluation of 
the stability of the inhibitors identified as above, to evaluate the efficacy of 
various preparations. 

Analogs of such compounds which are more permeable across bacterial 
host cell membranes may also be used. For example, dibutyryl derivatives 
often display an enhanced permeability. 

Partners, and compounds which inhibit the association of such partners, 
of any type of transcriptional regulation protein which associates into dimers 
may be identified by the bacterial methods of the invention. The methods of 
the invention can also be used to identify partners, and compounds which 
interfere with such partners, of membrane-localized and/or cytoplasmically- 
localized proteins which associate into dimers. 

It may be desired to further characterize the partner proteins of c-myc 
which are identified by the methods of the invention in a eukaryotic expression 
system. Such characterization may be performed according to the methods 
described in the inventor's copending U.S. patent application entitled "C-Myc 
Screening Assays," Serial 07/785,567 filed Oct. 30, 1991 and incorporated 
herein by reference. 

The following examples further describe the materials and methods 
used in carrying out the invention. The examples are not intended to limit the 
invention in any manner. 

Examples

Example 1
Construction of cI/c-myc Fusion Proteins

Chimeric genes capable of expressing fusion proteins containing the
DNA binding domain of the lambda repressor cI and either 1) the c-myc basic
helix-loop-helix (bHLH) dimerization domain or 2) the c-myc bHLH and
leucine zipper (LZ) dimerization domains were constructed.

The promoter/operator region used consists of the β-lactamase
promoter, lac operator and Shine-Dalgarno (S.D.) sequence. The sequence
is as follows:

GGA TCC TCT AAA TAC ATT CAA ATA AGT ATC CGC TCA TGA
BamHI                 -35

GAC AAT AAC GGT AAC CAG AAT TGT GAG CGC TCA CAA TTT TG
-10         BstEII

ATC GAT AGG AAA CTC GAG ATG ... [SEQ ID NO: 6]
ClaI    S.D.    XhoI    +1 cI
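
The annotated landmarks of the promoter/operator region can be cross-checked against SEQ ID NO:6 from the sequence listing. The following is a minimal sketch, not part of the original disclosure: the names `SEQ6`, `SITES` and `find_sites` are illustrative, and the recognition sequences are the standard ones for these enzymes (BstEII written as a regular expression because it recognizes GGTNACC). Positions are reported 0-based.

```python
import re

# SEQ ID NO:6 as given in the sequence listing (101 bp promoter/operator region).
SEQ6 = (
    "GGATCCTCTAAATACATTCAAATAAGTATCCGCTCATGAG"
    "ACAATAACGGTAACCAGAATTGTGAGCGCTCACAATTTTG"
    "ATCGATAGGAAACTCGAGATG"
)

# Recognition sequences of the enzymes named in the annotation.
SITES = {
    "BamHI": "GGATCC",
    "BstEII": "GGT[ACGT]ACC",
    "ClaI": "ATCGAT",
    "XhoI": "CTCGAG",
}


def find_sites(seq, sites):
    """Return the 0-based start positions of every site in seq."""
    return {
        name: [m.start() for m in re.finditer(pattern, seq)]
        for name, pattern in sites.items()
    }


if __name__ == "__main__":
    for name, positions in find_sites(SEQ6, SITES).items():
        print(name, positions)
    # The construct ends with the ATG start codon of the cI coding sequence.
    print("ends with ATG:", SEQ6.endswith("ATG"))
```

Each enzyme is expected to cut once, in the order BamHI, BstEII, ClaI, XhoI, which matches the left-to-right order of the annotation.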



The N-terminal 336 bp (112 amino acids) of cI, which contains the
DNA binding domain of this protein, was incorporated into this construct.
This portion was amplified for cloning using polymerase chain reaction with
primers adding XhoI and XbaI sites on the 5' and 3' ends, respectively. The
promoter/operator and cI DNA were cloned into pUC18 digested with BamHI
and XbaI to generate pUC3cI.

The sequence around the XbaI site is as follows:

5' CAG GCA GGG TCT AGA ... [SEQ ID NO: 7]
   Gln Ala Gly XbaI
   cI coding seq.

The bHLH/LZ and bHLH fragments of c-myc were generated by PCR
using a human c-myc cDNA as a template. The bHLH/LZ fragment used was
a 258 bp fragment synthesized with primers starting at sites #2 and #9 (Figure
1) with XbaI and SalI sites added at the 5' and 3' ends, respectively. The
bHLH fragment used was a 165 bp fragment with XbaI and PstI sites added on
the 5' and 3' ends, respectively. The boundaries of bHLH are at sites marked
#2 and #10 (Figure 1). The primer used at site #10 included a termination
codon, as did that used at site #9. Insertion of the c-myc sequences into
pU3cI was at the restriction sites corresponding to those added by the
indicated PCR primers. The resulting constructs containing c-myc bHLH/LZ
and bHLH were referred to as pU3.29 and pU3.210, respectively. As a result
of the cloning procedure used, an XbaI site (TCT AGA) encoding the amino acids
serine and arginine was incorporated between the cI and c-myc sequences.
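
The reading frame across the cI/c-myc junction can be checked by translating SEQ ID NO:7 codon by codon. This is a small illustrative sketch: the `CODONS` table holds only the five codons needed here (taken from the standard genetic code), and `translate` is a hypothetical helper name.

```python
# Only the codons that occur in SEQ ID NO:7 (standard genetic code).
CODONS = {
    "CAG": "Gln",
    "GCA": "Ala",
    "GGG": "Gly",
    "TCT": "Ser",
    "AGA": "Arg",
}


def translate(dna):
    """Translate a DNA string (spaces allowed) into three-letter residues."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return [CODONS[dna[i:i + 3]] for i in range(0, usable, 3)]


if __name__ == "__main__":
    # Last three cI codons followed by the XbaI site TCT AGA.
    print(translate("CAG GCA GGG TCT AGA"))
    # 336 bp of cI coding sequence correspond to 112 codons.
    print(336 // 3)
```

The two residues contributed by the XbaI site (Ser, Arg) sit in frame between the cI residues and the first c-myc residue, which is why the linker does not disturb either domain's reading frame.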

The chimeric cI/c-myc gene constructs in pUC18 were subcloned into
pACYC177 (Chang, A.C.Y. et al., J. Bacteriol. 134:1141-1156 (1978)) as
follows. Both chimeric genes were excised from pUC18 by digestion with
HindIII, fill-in of the HindIII overhang with Klenow, and subsequent BamHI
digestion. The chimeric gene fragments were then cloned into pACYC177
digested with BgR (filled in with Klenow) and BamHI. The resulting
constructs were designated pYC188, which contains cI-bHLH/LZ, and pYC192,
which contains cI-bHLH. These pYC constructs confer kanamycin resistance
upon transformed E. coli host cells and are normally maintained in low copy
number (5-20 copies/cell).



Example 2
Assaying transformed bacteria for the phenotype
conferred by the cI/c-myc fusion protein in homodimer form

The DNA binding domain of the cI protein must be present in dimer
form to function as a repressor of lambda transcription/infection. Native cI
protein is unable to form dimers at physiological levels and is therefore
functionally inactive. In contrast, fusion proteins containing a functional DNA
binding domain from cI and a functional dimerization domain from c-myc
should be able to form functional homodimer repressors. To detect the
repressor phenotype in bacterial cells transformed with the cI/c-myc fusion
constructs described in Example 1, two different assays were used.

In the "dot plaque assay" (DPA), transformed E. coli cells were tested
for susceptibility to lambda phage infection. These cells were predicted to be
resistant to infection if the cI/c-myc fusion protein was adequately expressed
and formed functional homodimers. Cell strains carrying fusion protein
constructs were grown in L-broth media containing 30 μg/ml Kanamycin, 10
mM MgSO4, and 0.2% maltose at 37°C. 0.25-0.5 ml of culture at an OD
of 1.0 to 2.0 was added to 3 ml of 48°C top agar, mixed by vortexing and
plated on pre-warmed L-broth/Kanamycin plates. The top agar was allowed
to solidify for 2-3 min. at room temperature and then 5 μl aliquots of lambda
phage KH54 (provided by J. Hu and R. Sauer of the Massachusetts Institute
of Technology) of titer SxlO^SxlO 6 plaque forming units (pfu) were dotted
onto the top agar. Lambda phage KH4i 434 (provided by J. Hu and R. Sauer
of the Massachusetts Institute of Technology), which carries the immunity
region of phage 434 and is therefore not affected by lambda cI, was also
dotted on as a control. Phage aliquots were allowed to dry and then the plates
were incubated overnight at 37°C.

In this assay, the titer of phage required to create a clear spot is used
as a measure of phage resistance. Bacteria that express native cI protein
(which is unable to form dimers) from pACYC177 (Chang, A.C.Y. et al., J.
Bacteriol. 134:1141-1156 (1978)) are not resistant and clearing can be seen
at < 10^2 pfu. In contrast, bacterial strains containing pYC188 and expressing
the cI-bHLH/LZ fusion protein are resistant up to 10M0 6 pfu and those
containing pYC192 and expressing the cI-bHLH are resistant up to 10^7 pfu.
This resistance demonstrates the ability of the cI/c-myc fusion proteins to
dimerize and effectively repress phage transcription/infection.
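
The readout of the dot plaque assay, the lowest dotted titer that clears the lawn, can be expressed as a short scoring routine. The sketch below is illustrative only: the function names are hypothetical and the spot readings are made-up placeholders, not measurements from this example.

```python
def clearing_titer(spots):
    """Return the lowest dotted titer (pfu) that gave a clear spot, or None.

    spots maps each dotted titer (pfu) to True if the spot cleared.
    """
    cleared = [titer for titer, is_clear in spots.items() if is_clear]
    return min(cleared) if cleared else None


def fold_resistance(test_titer, reference_titer):
    """How many times more phage the test strain tolerates than the reference."""
    return test_titer / reference_titer


if __name__ == "__main__":
    # Hypothetical readings for a sensitive and a resistant strain.
    sensitive = {10**2: True, 10**4: True, 10**6: True, 10**8: True}
    resistant = {10**2: False, 10**4: False, 10**6: False, 10**8: True}
    print(clearing_titer(sensitive))
    print(clearing_titer(resistant))
    print(fold_resistance(clearing_titer(resistant), clearing_titer(sensitive)))
```

A strain on which no spot clears returns `None`, which should be read as "resistant beyond the highest titer tested" rather than as a numeric value.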

In the second assay, referred to herein as the X-gal assay, cells
transformed with the cI/c-myc fusion construct pYC192 also contained a
chimeric lacZ gene under the control of the lambda PL promoter. In these
host cells, expression of functional cI/c-myc fusion protein would be expected
to repress lacZ expression via repression of the lambda PL promoter.
Expression of the lacZ gene is easily detectable by growth of cells on
X-gal-containing media. Cells expressing lacZ become blue on this media while
nonexpressing or poorly expressing cells become white or pale blue,
respectively.

WO 94/09133
PCT/US93/09634

As expected, those cells transformed with pYC192 grew as white or
pale blue colonies due to repression of the PL-lacZ gene while nontransformed
cells grew as blue colonies due to expression of the PL-lacZ gene.

Example 3
Screening a cDNA expression library for protein partners
able to form heterodimers with the cI/c-myc fusion protein

Interference with dimerization by direct protein-protein interaction 
between the dimerization domain of the chimeric repressor and a cDNA- 
encoded protein is the basis for the screening system of the invention. Upon 
dimerization of a repressor monomer with a heterologous protein partner, 
which is not part of a cI fusion, the repressor chimera will be inactivated, as
it is unable to bind DNA as a monomer.

The dot plaque assay (DPA) and X-gal assay, described in Example 2,
were used in the screening system. Bacteria expressing cI/c-myc fusion
proteins and exhibiting the homodimer-conferred repression phenotype (either
phage resistance or repression of PL-lacZ expression) were used.

Screening with the Dot Plaque Assay 

For the DPA, E. coli strain Y1090 (available from Promega Corporation,
Madison, Wisconsin) expressing the chimeric repressor cI/c-myc, which
is resistant to infection by λgt11, was used. Using this bacterial strain,
λgt11 phage cDNA libraries, expressing cDNA-encoded proteins as C-terminal
fusions with lacZ, were screened. Only those phage containing a cDNA
encoding a protein partner were predicted to form plaques due to interference
with cI/c-myc repressor/homodimer formation.

λgt11 libraries were screened as follows. Bacterial strain Y1090
containing pYC192 and expressing cI-myc bHLH was grown in L-broth media
containing Kanamycin, Mg2+, and maltose essentially the same as for the DPA
described in Example 2. 0.6 ml of culture was exposed to lxlOVSxlO 6 pfu
of λgt11 library phage in liquid for 20 min. at 30°C and then mixed with 7
ml of top agar and poured on 150 mm L-broth/Kanamycin plates. Plates
were incubated overnight at 42°C. Four cDNA libraries were screened: one
from HeLa cells, one from T cell line EL4, one from the pre-B cell line 38B9,
and one from primary tonsil cells which are almost exclusively B cells. These
libraries were respectively obtained from T. Kadesch at the University of
Pennsylvania, K. Georgopoulos of Massachusetts General Hospital, D. Weaver
of the Dana Farber Cancer Institute (DFCI), and T. Tedder of DFCI.

The lambda cDNA libraries were also plated onto a bacterial strain
expressing a chimeric repressor from plasmid pJH370 (Hu, J.C. et al.,
Science 250:1400-1403 (1990)) containing the cI DNA binding domain and the
leucine zipper dimerization domain of the yeast transcription factor GCN4.
For initial screenings, this strain acted as a comparison control to provide a
baseline number of false positive plaques to be expected from the libraries
resulting from phage mutations rendering them insensitive to the cI repressor.
This strain also acted as a control for subsequent screening of putative positive
plaques to determine if interference was due to a specific interaction with the
cI/c-myc fusion protein or if the interference was of a more general nature,
affecting the GCN4 dimerization domain as well.

For all libraries screened, essentially equal numbers of plaques were
observed with the strain containing cI-GCN4 vs. the strain containing pYC192
(cI-myc), indicating that these plaques were likely to be false positives. The
number of plaques obtained varied from 5 to approximately 250, depending
on the library used. Ninety phage which formed plaques on the strain
containing pYC192 were plaque purified and subsequently screened on the
cI-GCN4 containing strain. All these phage again formed plaques, indicating that
they did not specifically interact with the cI/c-myc fusion protein.

In light of these results, a subsequent experiment was performed to
determine if a known protein partner could be detected with this screening
procedure. In this experiment a bacterial strain expressing a cI/c-myc fusion
protein was challenged with a λgt11 phage expressing a Max cDNA. Max is
a bHLH/LZ protein known to interact with c-myc. The challenged cells
exhibited full resistance to the Max λgt11 phage.

In contrast to these results, DPA screening for a predicted protein
partner introduced before phage infection was successful. In this experiment,
a pUC18 plasmid capable of expressing a protein containing the bHLH/LZ
domains of c-myc, but not the DNA binding domain of cI, was introduced into
a bacterial strain which already contained a pACYC177 plasmid capable of
expressing a cI/c-myc fusion protein. The protein containing the bHLH/LZ
domains was predicted to function as a partner to the cI/c-myc fusion protein
and interfere with the repression of phage infection. As predicted, cells
expressing cI/c-myc and the bHLH/LZ protein were approximately 100-fold
less resistant to phage infection than cells expressing cI/c-myc only, as
measured by the DPA.

These results indicate that the DPA can be used to screen for protein
partners, but that the protein partner must be expressed in the bacteria before
they are challenged with phage. Simultaneous introduction of the protein partner
gene with the challenging phage, as occurs in the direct screen, probably does
not work because the phage is effectively repressed before the protein partner
gene is given a chance to express and interfere with the cI/c-myc fusion
protein repressor.

Screening with the X-Gal Assay 

As described above, in cells with an active cI/c-myc repressor the PL-lacZ
gene is turned off, resulting in the generation of white colonies on X-gal
indicator plates. Interference with repressor dimerization is predicted to yield
blue colonies as the lacZ gene would be expressed (de-repressed).

Screening using the X-gal assay was performed as follows. The strain
Y1090 was transformed with the lacZ target plasmid pNNPL387 and pYC188
which expresses cI-bHLH/LZ. The plasmid pNNPL387 was constructed by
inserting a PCR generated DNA fragment containing the left promoter of
phage lambda upstream of lacZ in pNN387 (provided by S. Elledge, Baylor
University; see Elledge, S.J. et al., Genes & Develop. 3:185-197 (1989)).
These cells, when plated on L-broth/Kanamycin/Chloramphenicol/X-gal, form
white to pale blue colonies. This strain, referred to as 10B18, was made
competent for electroporation (see Current Protocols in Molecular Biology, sec.
1.8.4, Wiley Interscience, ed. by Ausubel et al. (1987)) and transformed with
a plasmid-based cDNA library made from human peripheral blood
lymphocytes which had been transformed with Epstein-Barr Virus (provided
by S. Elledge; see Elledge, S.J. et al., Proc. Natl. Acad. Sci. USA 88:1731-1735
(1991)). There were about 10^7 recombinants in this once-amplified
library.

10B18 was electroporated on two separate occasions with 500 ng of
library DNA. Cells were allowed to recover from electroporation for 45 min.
at 37°C in SOC media and then plated on M9/0.2% mannitol with
Chloramphenicol (20 μg/ml), Kanamycin (30 μg/ml), IPTG (2 mM), Ampicillin
(50 μg/ml), and X-gal (0.004%). The electroporations yielded 2.8 x 10^6 and
5.6 x 10^5 transformants, of which approximately 500 and 29, respectively, were
blue. A total of 322 blue colonies were picked and restreaked to isolate single
colonies. From these, 97 blue clonal colonies were isolated and plasmid
DNAs were prepared. Plasmid DNA from each clone was then retransformed
into 10B18 and plated as above. Only one clone consistently produced blue
colonies.
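
The counts reported for this screen give a sense of how rare blue colonies were and how strongly the retransformation step filtered them. The following sketch simply redoes that arithmetic; the names are illustrative, and the second transformant count is entered as 5.6e5, which is an assumed reading of a partly illegible exponent in the source.

```python
# Counts reported for the two electroporations of strain 10B18.
# NOTE: the exponent of the second transformant count is partly illegible
# in the source text; 5.6e5 is an assumption.
ELECTROPORATIONS = [
    {"transformants": 2.8e6, "blue": 500},
    {"transformants": 5.6e5, "blue": 29},
]


def blue_frequency(run):
    """Fraction of transformants in one electroporation that were blue."""
    return run["blue"] / run["transformants"]


def screening_summary(runs, clonal=97, confirmed=1):
    """Pool the runs and relate the confirmed clone to the clonal isolates."""
    total_transformants = sum(r["transformants"] for r in runs)
    total_blue = sum(r["blue"] for r in runs)
    return {
        "total_transformants": total_transformants,
        "total_blue": total_blue,
        "overall_blue_frequency": total_blue / total_transformants,
        "confirmed_fraction_of_clonal": confirmed / clonal,
    }


if __name__ == "__main__":
    for run in ELECTROPORATIONS:
        print(f"{blue_frequency(run):.2e}")
    print(screening_summary(ELECTROPORATIONS))
```

Under these numbers roughly one transformant in several thousand was blue on first plating, and only one of the 97 clonal isolates survived retransformation, which is consistent with the high false-positive rate discussed in the text.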

This clone was shown to be specific for c-myc by comparing the
phenotypes it produced in different repressor chimera backgrounds. Bacterial
strains similar to 10B18 which contain different cI-dimerization domain fusion
constructs were used. These strains express cI fusions with the c-myc bHLH
domain (10B19), the transcription factor E2/5 bHLH domain (10BE2/5), or
thyroid hormone receptor β (10Bβ). The positive clone isolated in the original
screen produced blue colonies only in 10B18 and 10B19, where dimerization
was mediated by a c-myc domain. Strains 10BE2/5 and 10Bβ remained white
on X-gal plates after transformation with this clone.
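
The specificity test amounts to a simple rule: a clone is c-myc specific if it turns colonies blue in every strain whose repressor dimerizes through a c-myc domain and leaves all other strains white. A minimal sketch of that rule follows; the dictionary and function names are illustrative, and "10Bbeta" is used as an ASCII label for the thyroid hormone receptor β strain.

```python
# Dimerization domain carried by the cI fusion in each reporter strain.
STRAIN_DOMAINS = {
    "10B18": "c-myc bHLH/LZ",
    "10B19": "c-myc bHLH",
    "10BE2/5": "E2/5 bHLH",
    "10Bbeta": "thyroid hormone receptor beta",
}

# Colony colors reported for the positive clone.
OBSERVED = {
    "10B18": "blue",
    "10B19": "blue",
    "10BE2/5": "white",
    "10Bbeta": "white",
}


def is_specific(observed, domains, target="c-myc"):
    """True if colonies are blue exactly in the strains carrying the target domain."""
    for strain, color in observed.items():
        expect_blue = target in domains[strain]
        if (color == "blue") != expect_blue:
            return False
    return True


if __name__ == "__main__":
    print(is_specific(OBSERVED, STRAIN_DOMAINS))
    # A clone that de-represses every strain would be a general false positive.
    all_blue = {strain: "blue" for strain in STRAIN_DOMAINS}
    print(is_specific(all_blue, STRAIN_DOMAINS))
```

The same rule captures the earlier cI-GCN4 control in the dot plaque screen, where phage that plaqued on both strains were discarded as non-specific.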

The high number of false positives obtained during the initial rounds
of screening could be due to the instability of the plasmid containing the
chimeric repressor gene in the screening strain. Alternatively, blue colonies
could result from an increase in the copy number of the PL-lacZ containing
plasmid or increased expression of the PL-lacZ gene which titrates out
repressor dimers. Whatever the cause, repeated passages through the 10B18
strain were effective in screening out false positives.

Example 4
Identification of compounds which prevent
c-myc partner formation

To identify compounds which inhibit c-myc partner heterodimerization
without interfering with c-myc homodimerization, cells identified according to
the method described in Example 3 which contain the cI/c-myc fusion protein
and a partner protein are used along with cells containing only the cI/c-myc
fusion protein as described in Example 2. These cells are further exposed to
experimental compounds W, X, Y, and Z and the effect of such compounds
on the homodimer/heterodimer dependent phenotype is determined.

Typical results from such an experiment are shown in Table 1. 

Table 1: Identification of c-myc Protein Partner Inhibitors

Compound   Protein Partner   Assay Phenotype
none       -                 no plaques/white
none       +                 plaques/blue
W          -                 no plaques/white
W          +                 plaques/blue
X          -                 plaques/blue
X          +                 plaques/blue
Y          -                 no plaques/white
Y          +                 no plaques/white



The results of the above table indicate that, in the absence of the
partner protein, compound W had no effect on the ability of the cI/c-myc
protein to form homodimers and exhibit the corresponding phenotype.
Compound W also had no effect on the ability of the partner to form
heterodimers with the myc fusion protein and reverse the
homodimer-conferred phenotype. Therefore, compound W will not be a compound of
interest.

Compound X interfered with homodimer formation and therefore will 
not be a compound of interest. 

Compound Y is an inhibitor of heterodimer formation. Compound Y 
did not interfere with homodimer formation but did interfere with heterodimer 
formation. Therefore, compound Y is a compound of interest as it may 
disrupt c-myc action in vivo. 
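
The reasoning applied to compounds W, X and Y can be written as a two-step decision on the X-gal readout: first check that the compound leaves the homodimer phenotype intact in cells without the partner, then check whether it restores that phenotype in cells with the partner. The sketch below is illustrative; the function name and result labels are hypothetical, and criterion (c) of the description (no inhibition of cell growth) is not modeled.

```python
def classify(without_partner, with_partner):
    """Classify a compound from colony color ('white' or 'blue').

    without_partner: cells expressing only the cI/c-myc fusion protein.
    with_partner: cells expressing the fusion protein and the partner.
    """
    if without_partner == "blue":
        # Repression lost even without the partner: homodimer is disrupted.
        return "interferes with homodimer formation"
    if with_partner == "white":
        # Repression restored despite the partner: heterodimer is blocked.
        return "inhibitor of heterodimer formation"
    return "no effect"


# X-gal phenotypes from Table 1, as (without partner, with partner).
TABLE_1 = {
    "none": ("white", "blue"),
    "W": ("white", "blue"),
    "X": ("blue", "blue"),
    "Y": ("white", "white"),
}

if __name__ == "__main__":
    for compound, phenotypes in TABLE_1.items():
        print(compound, "->", classify(*phenotypes))
```

Only a compound that scores as an inhibitor of heterodimer formation, as Y does here, would be carried forward as a compound of interest.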

All references cited herein are fully incorporated by reference. Having
now fully described the invention, it will be understood by those with skill in
the art that the same may be performed within a wide and equivalent range
of conditions, parameters and the like, without affecting the spirit or scope of
the invention or any embodiment thereof.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

  (i) APPLICANT: Kingston, Robert E.
                 Bunker, Christopher Alden

  (ii) TITLE OF INVENTION: Protein Partner Screening Assays and
                           Uses Thereof

  (iii) NUMBER OF SEQUENCES: 7

  (iv) CORRESPONDENCE ADDRESS:
    (A) ADDRESSEE: Sterne, Kessler, Goldstein and Fox
    (B) STREET: 1100 New York Avenue, N.W.; Suite 600
    (C) CITY: Washington
    (D) STATE: D.C.
    (E) COUNTRY: U.S.A.
    (F) ZIP: 20005-3934

  (v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Floppy disk
    (B) COMPUTER: IBM PC compatible
    (C) OPERATING SYSTEM: PC-DOS/MS-DOS
    (D) SOFTWARE: PatentIn Release #1.0, Version #1.25

  (vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER: PCT (to be assigned)
    (B) FILING DATE: (herewith)
    (C) CLASSIFICATION:

  (viii) ATTORNEY/AGENT INFORMATION:
    (A) NAME: Cimbala, Michele A.
    (B) REGISTRATION NUMBER: 33,651
    (C) REFERENCE/DOCKET NUMBER: 0609.274PC03

  (ix) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE: (202) 371-2600
    (B) TELEFAX: (202) 371-2540



(2) INFORMATION FOR SEQ ID NO: 1:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 11 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear

  (ii) MOLECULE TYPE: DNA

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
GTCAAGATGG C 11

(2) INFORMATION FOR SEQ ID NO: 2:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear

  (ii) MOLECULE TYPE: DNA

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
AGCAGCTGGC 10

(2) INFORMATION FOR SEQ ID NO: 3:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 10 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear

  (ii) MOLECULE TYPE: DNA

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
GTCATGTGGC 10

(2) INFORMATION FOR SEQ ID NO: 4:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 1419 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear

  (ii) MOLECULE TYPE: DNA

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

AGGAGGAACA AGAAGATGAG GAAGAAATCG ATGTTGTTTC TGTGGAAAAG AGGCAGGCTC 60
CTGGCAAAAG GTCAGAGTCT GGATCACCTT CTGCTGGAGG CCACAGGAAA CCTCCTCACA 120
GCCCACTGGT CCTCAAGAGG TGCCACGTCT CCACACATCA GCACAACTAC GCAGCGCCTC 180
CCTCCACTCG GAAGGACTAT CCTGCTGCCA AGAGGGTCAA GTTGGACAGT GTCAGAGTCC 240
TGAGACAGAT CAGCAACAAC CGAAAATGCA CCAGCCCCAG GTCCTCGGAC ACCGAGGAGA 300
ATGTCAAGAG GCGAACACAC AACGTCTTGG AGCGCCAGAG GAGGAACGAG CTAAAACGGA 360
GCTTTTTTGC CCTGCGTGAC CAGATCCCGG AGTTGGAAAA CAATGAAAAG GCCCCCAAGG 420
TAGTTATCCT TAAAAAAGCC ACAGCATACA TCCTGTCCGT CCAAGCAGAG GAGCAAAAGC 480
TCATTTCTGA AGAGGACTTG TTGCGGAAAC GACGAGAACA GTTGAAACAC AAACTTGAAC 540
AGCTACGGAA CTCTTGTGCG TAAGGAAAAG TAAGGAAAAC GATTCCTTCT AACAGAAATG 600
TCCTGAGCAA TCACCTATGA ACTTGTTTCA AATGCATGAT CAAATGCAAC CTCACAACCT 660
TGGCTGAGTC TTGAGACTGA AAGATTTAGC CATAATGTAA ACTGCCTCAA ATTGGACTTT 720
GGGCATAAAA GAACTTTTTT ATGCTTACCA TCTTTTTTTT TTCTTTAACA GATTTGTATT 780
TAAGAATTGT TTTTAAAAAA TTTTAAGATT TACACAATGT TTCTCTGTAA ATATTGCCAT 840
TAAATGTAAA TAACTTTAAT AAAACGTTTA TAGCAGTTAC ACAGAATTTC AATCCTAGTA 900
TATAGTACCT AGTATTATAG GTACTATAAA CCCTAATTTT TTTTATTTAA GTACATTTTG 960
CTTTTTAAAG TTGATTTTTT TCTATTGTTT TTAGAAAAAA TAAAATAACT GGCAAATATA 1020
TCATTGAGCC AAATCTTAAG TTGTGAATGT TTTGTTTCGT TTCTTCCCCC TCCCAACCAC 1080
CACCATCCCT GTTTGTTTTC ATCAATTGCC CCTTCAGAGG GTGGTCTTAA GAAAGGCAAG 1140
AGTTTTCCTC TGTTGAAATG GGTCTGGGGG CCTTAAGGTC TTTAAGTTCT TGGAGGTTCT 1200
AAGATGCTTC CTGGAGACTA TGATAACAGC CGAAGTTGAC AGTTAGAAGG AATGGCAGAA 1260
GGCAGGTGAG AAGGTGAGAG GTAGGCAAAG GAGATACAAG AGGTCAAAGG TAGCAGTTAA 1320
GTACACAAAG AGGCATAAGG ACTGGGGAGT TGGGAGGAAG GTGAGGAAGA AACTCCTGTT 1380
ACTTTAGTTA ACCAGTGCCA GTCCCCTGCT CACTCCAAA 1419

(2) INFORMATION FOR SEQ ID NO: 5:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 186 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: both

  (ii) MOLECULE TYPE: protein

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Glu Glu Gln Glu Asp Glu Glu Glu Ile Asp Val Val Ser Val Glu Lys
1               5                   10                  15

Arg Gln Ala Pro Gly Lys Arg Ser Glu Ser Gly Ser Pro Ser Ala Gly
            20                  25                  30

Gly His Ser Lys Pro Pro His Ser Pro Leu Val Leu Lys Arg Cys His
        35                  40                  45

Val Ser Thr His Gln His Asn Tyr Ala Ala Pro Pro Ser Thr Arg Lys
    50                  55                  60

Asp Tyr Pro Ala Ala Lys Arg Val Lys Leu Asp Ser Val Arg Val Leu
65                  70                  75                  80

Arg Gln Ile Ser Asn Asn Arg Lys Cys Thr Ser Pro Arg Ser Ser Asp
                85                  90                  95

Thr Glu Glu Asn Val Lys Arg Arg Thr His Asn Val Leu Glu Arg Gln
            100                 105                 110

Arg Arg Asn Glu Leu Lys Arg Ser Phe Phe Ala Leu Arg Asp Gln Ile
        115                 120                 125

Pro Glu Leu Glu Asn Asn Glu Lys Ala Pro Lys Val Val Ile Leu Lys
    130                 135                 140

Lys Ala Thr Ala Tyr Ile Leu Ser Val Gln Ala Glu Glu Gln Lys Leu
145                 150                 155                 160

Ile Ser Glu Glu Asp Leu Leu Arg Lys Arg Arg Glu Gln Leu Lys His
                165                 170                 175

Lys Leu Glu Gln Leu Arg Asn Ser Cys Ala
            180                 185

(2) INFORMATION FOR SEQ ID NO: 6:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 101 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear

  (ii) MOLECULE TYPE: DNA

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GGATCCTCTA AATACATTCA AATAAGTATC CGCTCATGAG ACAATAACGG TAACCAGAAT 60
TGTGAGCGCT CACAATTTTG ATCGATAGGA AACTCGAGAT G 101

(2) INFORMATION FOR SEQ ID NO: 7:

  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 15 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: double
    (D) TOPOLOGY: linear

  (ii) MOLECULE TYPE: DNA

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
CAGGCAGGGT CTAGA 15






WHAT IS CLAIMED IS: 

1. A method for identifying and classifying a protein partner 
wherein said method comprises: 

(a) transformation of a host cell with a genetic construct capable of 
expressing a fusion protein, wherein said fusion protein contains a DNA 
binding domain and a dimerization domain complementary to itself which is 
not naturally associated with said DNA binding domain, and wherein said 
fusion protein forms a homodimer which confers a detectable phenotype upon 
said host cell; 

(b) transformation of said host cell of part (a) with a genetic 
construct capable of expressing said protein partner; 

(c) culturing said host cell of part (b) under conditions which 
express said fusion protein and said protein partner, said protein partner being 
expressed at levels equivalent to or greater than said fusion protein; 

(d) determining whether the phenotype conferred by said fusion 
protein of part (a) is present in said host cell of part (c); and 

(e) classifying said protein partner on the basis of the presence or 
absence of said phenotype. 

2. A method of identifying and classifying a compound as an 
inhibitor of a protein partner, wherein said method comprises: 

(a) transformation of a bacterial host cell with a genetic construct 
capable of expressing a fusion protein, wherein said fusion protein contains a 
DNA binding domain and a dimerization domain complementary to itself 
which is not naturally associated with said DNA binding domain, and wherein 
said fusion protein forms a homodimer which confers a detectable phenotype 
upon said host cell; 

(b) transformation of said host cell of part (a) with a genetic 
construct capable of expressing said protein partner; 

(c) culturing said host cell of part (b) in the presence of said
compound and under conditions which express said fusion protein and said
protein partner, said protein partner being expressed at levels equivalent to or
greater than said fusion protein;

(d) determining the ability of said compound to prevent
protein-partner-induced interference of the phenotype conferred by said fusion protein
of part (a); and

(e) classifying said compound as an inhibitor of protein partner
formation on the basis of the presence or absence of said phenotype.

3. The method of any one of claims 1 or 2, wherein said phenotype
conferred by said fusion protein in homodimer form is the repression of
expression of an assayable marker gene.

4. The method of claim 3, wherein said assayable marker is under the
transcriptional control of the bacteriophage λ PL promoter.

5. The method of claim 4, wherein said assayable marker is the lacZ
gene.

6. The method of any one of claims 1 or 2, wherein said DNA
binding domain of said fusion protein is the DNA binding domain of
bacteriophage λ cI repressor protein.

7. The method of claim 6, wherein said DNA binding domain of
said cI repressor protein is the N-terminal 112 amino acids of said repressor
protein.

8. The method of any one of claims 1 or 2, wherein said 
dimerization domain is a bHLH domain. 

9. The method of claim 8, wherein said bHLH domain is from
myc.

10. The method of claim 9, wherein said myc is c-myc.

11. The method of claim 10, wherein said bHLH domain is the
amino acid sequence bounded by site numbers 2 and 10 of Figure 1.

12. The method of any one of claims 1 or 2, wherein said 
dimerization domain is a bZIP domain. 

13. A method for identifying and classifying a protein partner
wherein said method comprises:

(a) transformation of a bacterial host cell with a genetic construct
capable of expressing a first fusion protein and a second fusion protein,
wherein said first fusion protein contains a DNA binding domain and a first
dimerization domain not naturally associated with said DNA binding domain,
and wherein said second fusion protein contains said DNA binding domain and
a second dimerization domain complementary to said first dimerization domain
wherein said second dimerization domain is not naturally associated with said
DNA binding domain, and wherein said first fusion protein and said second
fusion protein form a DNA binding domain homodimer which confers a
detectable phenotype upon said host cell;

(b) transformation of said host cell of part (a) with a genetic 
construct capable of expressing said protein partner; 

(c) culturing said host cell of part (b) under conditions which
express said first fusion protein, said second fusion protein, and said protein
partner, said protein partner being expressed at levels equivalent to or greater
than either said first fusion protein or said second fusion protein;

(d) determining whether the phenotype conferred by said DNA
binding domain homodimer of part (a) is present in said host cell of part (c);
and

(e) classifying said protein partner on the basis of the presence or 
absence of said phenotype. 

14. A method of identifying and classifying a compound as an 
inhibitor of a protein partner, wherein said method comprises: 

(a) transformation of a bacterial host cell with a genetic construct 
capable of expressing a first fusion protein and a second fusion protein, 
wherein said first fusion protein contains a DNA binding domain and a first 
dimerization domain, and wherein said second fusion protein contains said 
DNA binding domain and a second dimerization domain complementary to 
said first dimerization domain, and wherein said first fusion protein and said 
second fusion protein form a DNA binding domain homodimer which confers 
a detectable phenotype upon said host cell; 

(b) transformation of said host cell of part (a) with a genetic 
construct capable of expressing said protein partner; 

(c) culturing said host cell of part (b) in the presence of said 
compound and under conditions which express said first fusion protein, said 
second fusion protein, and said protein partner, said protein partner being 
expressed at levels equivalent to or greater than either said first fusion protein 
or said second fusion protein; 

(d) determining the ability of said compound to prevent 
protein-partner-induced interference of the phenotype conferred by said DNA 
binding domain homodimer of part (a); and 

(e) classifying said compound as an inhibitor of protein partner 
formation on the basis of the presence or absence of said phenotype. 
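
The decision steps (d) and (e) of claims 13 and 14 can be sketched in code. This is only an illustrative sketch under stated assumptions: the names `AssayResult`, `classify_protein_partner` and `classify_compound` are hypothetical, the phenotype is reduced to a yes/no readout, and the reading that loss of the homodimer phenotype indicates interference by the protein partner is inferred from the "protein-partner-induced interference" wording of claim 14(d). The control culture without compound is likewise an assumption and is not recited in the claims.

```python
from dataclasses import dataclass


@dataclass
class AssayResult:
    """Yes/no readout of the phenotype conferred by the DNA binding domain homodimer."""
    phenotype_present: bool


def classify_protein_partner(result: AssayResult) -> str:
    """Claim 13(d)-(e): classify a candidate by presence or absence of the phenotype.

    Assumption: absence of the phenotype means the candidate interfered with
    formation of the DNA binding domain homodimer.
    """
    return "non-interfering" if result.phenotype_present else "interfering"


def classify_compound(without_compound: AssayResult,
                      with_compound: AssayResult) -> str:
    """Claim 14(d)-(e): classify a compound by whether it restores the phenotype.

    Assumption: a control culture without the compound is scored as well, so
    that a restored phenotype can be attributed to the compound.
    """
    if without_compound.phenotype_present:
        # The protein partner does not interfere, so there is nothing to inhibit.
        return "uninformative"
    return "inhibitor" if with_compound.phenotype_present else "not an inhibitor"


if __name__ == "__main__":
    print(classify_protein_partner(AssayResult(phenotype_present=False)))
    print(classify_compound(AssayResult(False), AssayResult(True)))
```

In practice the readout would depend on the assayable marker chosen in claims 3 to 5; the boolean used here stands in for whatever scoring the assay provides.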

                                    AGGAGGAACAAGAAGATGA  5000
                                    luGluGluGlnGluAspGl

5001 GGAAGAAATCGATGTTGTTTCTGTGGAAAAGAGGCAGGCTCCTGGCAAAA
     uGluGluIleAspValValSerValGluLysArgGlnAlaProGlyLysA

5051 GGTCAGAGTCTGGATCACCTTCTGCTGGAGGCCACAGGAAACCTCCTCAC
     rgSerGluSerGlySerProSerAlaGlyGlyHisSerLysProProHis

5101 AGCCCACTGGTCCTCAAGAGGTGCCACGTCTCCACACATCAGCACAACTA
     SerProLeuValLeuLysArgCysHisValSerThrHisGlnHisAsnTy

5151 CGCAGCGCCTCCCTCCACTCGGAAGGACTATCCTGCTGCCAAGAGGGTCA
     rAlaAlaProProSerThrArgLysAspTyrProAlaAlaLysArgValL

5201 AGTTGGACAGTGTCAGAGTCCTGAGACAGATCAGCAACAACCGAAAATGC
     ysLeuAspSerValArgValLeuArgGlnIleSerAsnAsnArgLysCys

5251 ACCAGCCCCAGGTCCTCGGACACCGAGGAGAATGTC|AAGAGGCGAACACA
     ThrSerProArgSerSerAspThrGluGluAsnValLysArgArgThrHi

5301 CAACGTCTTGGAGCGCCAGAGGAGGAACGAGCTAAAACGGAGCTTTTTTG
     sAsnValLeuGluArgGlnArgArgAsnGluLeuLysArgSerPhePheA

     H1

5351 CCCTGCGTGACCAGATCCCGGAGTTGGAAAACAATGAAAAGGCCCCCAAG
     laLeuArgAspGlnIleProGluLeuGluAsnAsnGluLysAlaProLys

5401 GTAGTTATCCTTAAAAAAGCCACAGCATACATCCTGTCCGTCCAAGCAGA
     ValValIleLeuLysLysAlaThrAlaTyrIleLeuSerValGlnAlaGl

     H2   #10

5451 G|GAGCAAAAGCTCATTTCTGAAGAGGACTTGTTGCGGAAACGACGAGAAC
     u|GluGlnLysLeuIleSerGluGluAspLeuLeuArgLysArgArgGluG

5501 AGTTGAAACACAAACTTGAACAGCTACGGAACTCTTGTGCGTAAGGAAAA
     lnLeuLysHisLysLeuGluGlnLeuArgAsnSerCysAlaEnd

FIG. 1 

5551 GTAAGGAAAACGATTCCTTCTAACAGAAATGTCCTGAGCAATCACCTATG
5601 AACTTGTTTCAAATGCATGATCAAATGCAACCTCACAACCTTGGCTGAGT
5651 CTTGAGACTGAAAGATTTAGCCATAATGTAAACTGCCTCAAATTGGACTT
5701 TGGGCATAAAAGAACTTTTTTATGCTTACCATCTTTTTTTTTTCTTTAAC
5751 AGATTTGTATTTAAGAATTGTTTTTAAAAAATTTTAAGATTTACACAATG
5801 TTTCTCTGTAAATATTGCCATTAAATGTAAATAACTTT[AATAAA]ACGTTT
5851 ATAGCAGTTACACAGAATTTCAATCCTAGTATATAGTACCTAGTATTATA
5901 GGTACTATAAACCCTAATTTTTTTTATTTAAGTACATTTTGCTTTTTAAA
5951 GTTGATTTTTTTCTATTGTTTTTAGAAAA[AATAAA]ATAACTGGCAAATAT
6001 ATCATTGAGCCAAATCTTAAGTTGTGAATGTTTTGTTTCGTTTCTTCCCC
6051 CTCCCAACCACCACCATCCCTGTTTGTTTTCATCAATTGCCCCTTCAGAG
6101 GGTGGTCTTAAGAAAGGCAAGAGTTTTCCTCTGTTGAAATGGGTCTGGGG
6151 GCCTTAAGGTCTTTAAGTTCTTGGAGGTTCTAAGATGCTTCCTGGAGACT
6201 ATGATAACAGCCGAAGTTGACAGTTAGAAGGAATGGCAGAAGGCAGGTGA
6251 GAAGGTGAGAGGTAGGCAAAGGAGATACAAGAGGTCAAAGGTAGCAGTTA
6301 AGTACACAAAGAGGCATAAGGACTGGGGAGTTGGGAGGAAGGTGAGGAAG
6351 AAACTCCTGTTACTTTAGTTAACCAGTGCCAGTCCCCTGCTCACTCCAAA

FIG. 1 cont. 
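
The three-letter translation printed beneath the nucleotide sequence of FIG. 1 can be checked against the standard genetic code. The following is a minimal sketch, not part of the disclosed method: the function name `translate` and the table names are hypothetical, and the example input is the final coding line of FIG. 1 with the leading C carried over from the preceding line to complete the first codon.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# One-letter to three-letter names; the stop codon is written "End" as in FIG. 1.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "End",
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into concatenated three-letter codes.

    Trailing bases that do not fill a complete codon are ignored.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    codons = (dna[i:i + 3] for i in range(0, usable, 3))
    return "".join(THREE_LETTER[CODON_TABLE[codon]] for codon in codons)


if __name__ == "__main__":
    # Leading "C" completes the codon split across the line break in FIG. 1.
    last_line = "C" + "AGTTGAAACACAAACTTGAACAGCTACGGAACTCTTGTGCGTAA"
    print(translate(last_line))
    # prints GlnLeuLysHisLysLeuGluGlnLeuArgAsnSerCysAlaEnd
```

Because each printed line of FIG. 1 holds 50 nucleotides, codons straddle line breaks; concatenating consecutive lines before translating avoids having to carry partial codons by hand.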






